Content-Based Weak Supervision for Ad-Hoc Re-Ranking
One challenge with neural ranking is the need for a large amount of
manually-labeled relevance judgments for training. In contrast with prior work,
we examine the use of weak supervision sources for training that yield pseudo
query-document pairs that already exhibit relevance (e.g., newswire
headline-content pairs and encyclopedic heading-paragraph pairs). We also
propose two filtering techniques to eliminate training samples that are too far
out of domain: a heuristic-based approach and a novel supervised
filter that re-purposes a neural ranker. Using several leading neural ranking
architectures and multiple weak supervision datasets, we show that these
sources of training pairs are effective on their own (outperforming prior weak
supervision techniques), and that filtering can further improve performance.
Comment: SIGIR 2019 (short paper)
Same but Different: Distant Supervision for Predicting and Understanding Entity Linking Difficulty
Entity Linking (EL) is the task of automatically identifying entity mentions
in a piece of text and resolving them to a corresponding entity in a reference
knowledge base like Wikipedia. There is a large number of EL tools available
for different types of documents and domains, yet EL remains a challenging task
where the lack of precision on particularly ambiguous mentions often spoils the
usefulness of automated disambiguation results in real applications. A priori
approximations of the difficulty of linking a particular entity mention can
facilitate flagging of critical cases as part of semi-automated EL systems,
while detecting latent factors that affect the EL performance, like
corpus-specific features, can provide insights on how to improve a system based
on the special characteristics of the underlying corpus. In this paper, we
first introduce a consensus-based method to generate difficulty labels for
entity mentions on arbitrary corpora. The difficulty labels are then exploited
as training data for a supervised classification task able to predict the EL
difficulty of entity mentions using a variety of features. Experiments over a
corpus of news articles show that EL difficulty can be estimated with high
accuracy, revealing also latent features that affect EL performance. Finally,
evaluation results demonstrate the effectiveness of the proposed method to
inform semi-automated EL pipelines.
Comment: Preprint of paper accepted for publication in the 34th ACM/SIGAPP
Symposium On Applied Computing (SAC 2019)
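The consensus idea described in the abstract above can be illustrated with a minimal sketch. It assumes that difficulty is derived from how strongly several EL tools agree on the entity for one mention. The function name, the three label names, and both thresholds are hypothetical choices for illustration and are not taken from the paper.

```python
from collections import Counter

def difficulty_label(links, easy_at=1.0, hard_below=0.5):
    """Label one mention from the entity IDs proposed by several EL tools.

    links: one entry per tool, an entity ID or None if the tool linked nothing.
    easy_at, hard_below: hypothetical agreement thresholds (assumptions).
    """
    if not links:
        raise ValueError("need at least one tool output")
    # Count votes per entity, ignoring tools that produced no link.
    votes = Counter(link for link in links if link is not None)
    top_votes = votes.most_common(1)[0][1] if votes else 0
    # Share of all tools that back the most popular entity.
    agreement = top_votes / len(links)
    if agreement >= easy_at:
        return "easy"
    if agreement < hard_below:
        return "hard"
    return "medium"
```

Labels produced this way could then serve as training targets for a supervised difficulty classifier, which is the role the abstract assigns to them.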
Towards Better Text Understanding and Retrieval through Kernel Entity Salience Modeling
This paper presents a Kernel Entity Salience Model (KESM) that improves text
understanding and retrieval by better estimating entity salience (importance)
in documents. KESM represents entities by knowledge enriched distributed
representations, models the interactions between entities and words by kernels,
and combines the kernel scores to estimate entity salience. The whole model is
learned end-to-end using entity salience labels. The salience model also
improves ad hoc search accuracy, providing effective ranking features by
modeling the salience of query entities in candidate documents. Our experiments
on two entity salience corpora and two TREC ad hoc search datasets demonstrate
the effectiveness of KESM over frequency-based and feature-based methods. We
also provide examples showing how KESM conveys its text understanding ability
learned from entity salience to search.
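As a rough illustration of kernel-based interaction modeling, the sketch below soft-counts the cosine similarities between one entity vector and the document's word vectors with Gaussian (RBF) kernels, then combines the kernel scores linearly. The kernel means, the width, the weights, and all function names are assumptions made for this example. The actual KESM architecture and its learned parameters are not specified in the abstract.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def kernel_features(entity_vec, word_vecs, mus=(1.0, 0.5, 0.0, -0.5), sigma=0.1):
    """One soft-count per kernel: how many words have similarity near mu."""
    sims = [cosine(entity_vec, w) for w in word_vecs]
    return [
        sum(math.exp(-((s - mu) ** 2) / (2 * sigma ** 2)) for s in sims)
        for mu in mus
    ]

def salience_score(features, weights, bias=0.0):
    """Linear combination of kernel scores; the weights would be learned."""
    return sum(w * f for w, f in zip(weights, features)) + bias
```

In an end-to-end setting the embeddings and combination weights would be trained jointly against entity salience labels. Here they are plain inputs so that the sketch stays self-contained.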
Balancing Reinforcement Learning Training Experiences in Interactive Information Retrieval
Interactive Information Retrieval (IIR) and Reinforcement Learning (RL) share
many commonalities, including an agent that learns while interacting, a long-term
and complex goal, and an algorithm that explores and adapts. One challenge in
applying RL methods to IIR is obtaining sufficient relevance labels
to train the RL agents, which are notoriously sample-inefficient.
However, in a text corpus annotated for a given query, it is not the relevant
documents but the irrelevant documents that predominate. This would cause very
unbalanced training experiences for the agent and prevent it from learning any
policy that is effective. Our paper addresses this issue by using domain
randomization to synthesize more relevant documents for the training. Our
experimental results on the Text REtrieval Conference (TREC) Dynamic Domain
(DD) 2017 Track show that the proposed method is able to boost an RL agent's
learning effectiveness by 22% in dealing with unseen situations.
Comment: Accepted by SIGIR 202
Interoperable human behavior models for simulations
Modern simulations and games have limited capabilities for simulated characters to interact with each other and with humans in rich, meaningful ways. Although significant achievements have been made in developing
human behavior models (HBMs) that are able to control a single simulated entity (or a single group of simulated entities), a limiting factor is the inability of HBMs developed by different groups to interact with each other. We present an architecture and multi-level message framework for enabling HBMs to communicate with each other about their actions and their intents, and describe the results of our crowd control demonstration system which applied it to allow three distinct HBMs to interoperate within a single training-oriented simulation. Our hope is that this will encourage the development of standards for interoperability among HBMs which will lead to the development of richer training
and analysis simulations.
Postprint (author's final draft)
Growth of Sobolev norms for the analytic NLS on T^2
We consider the completely resonant non-linear Schrödinger equation on the two-dimensional torus with any analytic gauge-invariant nonlinearity. Fix s>1. We show the existence of solutions of this equation which achieve arbitrarily large growth of H^s Sobolev norms. We also give estimates for the time required to attain this growth.
Postprint (author's final draft)
A multicenter study to evaluate pulmonary function in osteogenesis imperfecta
Pulmonary complications are a significant cause of morbidity and mortality in osteogenesis imperfecta (OI). However, to date, few studies have systematically evaluated pulmonary function in individuals with OI. We analyzed spirometry measurements, including forced vital capacity (FVC) and forced expiratory volume in the first second (FEV1), in a large cohort of individuals with OI (n = 217) enrolled in a multicenter, observational study. We show that individuals with the more severe form of the disease, OI type III, have significantly reduced FVC and FEV1, which do not follow the expected trends of the normal population. We also show that "normalization" of FVC and FEV1 using general population data to generate percent predicted values underestimates the pulmonary involvement in OI. Within each subtype of OI, we used linear mixed models to find potential correlations of FEV1 and FVC with clinical variables including mobility, bisphosphonate use, and scoliosis. Our results are an important step in understanding the extent of pulmonary involvement in individuals with OI and in developing pulmonary endpoints for use in routine patient care as well as in the investigation of new therapies.